Two-Band Excitation for HMM-Based Speech Synthesis
Authors
Abstract
© 2009 Seungho Han et al. This paper presents a two-band excitation model for hidden Markov model (HMM)-based speech synthesis, built on estimation of the optimum maximum voiced frequency (MVF). The MVF is estimated with an analysis-by-synthesis scheme that selects the value yielding the minimum spectral distortion of the synthesized speech. Experimental results show that the proposed method significantly improves synthetic speech quality.
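The analysis-by-synthesis idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`two_band_excitation`, `log_spectral_distortion`, `estimate_mvf`), the pulse-train and white-noise sources, the candidate grid, and the choice to measure distortion on the excitation alone (rather than on speech filtered by a spectral envelope) are all assumptions made for illustration.

```python
import numpy as np

def two_band_excitation(f0, mvf, fs, n, rng):
    """Hypothetical helper: pulse train below `mvf`, white noise above it."""
    period = int(round(fs / f0))
    pulses = np.zeros(n)
    pulses[::period] = np.sqrt(period)          # roughly unit-power pulse train
    noise = rng.standard_normal(n)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    voiced_band = freqs <= mvf
    # Take the periodic spectrum in the low band and the noise spectrum in the high band.
    spec = np.where(voiced_band, np.fft.rfft(pulses), np.fft.rfft(noise))
    return np.fft.irfft(spec, n)

def log_spectral_distortion(x, y, eps=1e-8):
    """RMS difference between the dB magnitude spectra of two frames."""
    x_db = 20.0 * np.log10(np.abs(np.fft.rfft(x)) + eps)
    y_db = 20.0 * np.log10(np.abs(np.fft.rfft(y)) + eps)
    return float(np.sqrt(np.mean((x_db - y_db) ** 2)))

def estimate_mvf(target, f0, fs, candidates, seed=0):
    """Synthesize one frame per candidate MVF and keep the least-distorted one."""
    best_mvf, best_dist = None, np.inf
    for mvf in candidates:
        rng = np.random.default_rng(seed)       # same noise draw for every candidate
        synth = two_band_excitation(f0, mvf, fs, len(target), rng)
        dist = log_spectral_distortion(target, synth)
        if dist < best_dist:
            best_mvf, best_dist = mvf, dist
    return best_mvf
```

Calling `estimate_mvf(frame, f0, fs, candidates)` on an analysis frame returns the candidate whose resynthesized frame is spectrally closest to it. A practical system would presumably compare synthesized speech rather than raw excitation and would not rely on a fixed noise seed.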
Similar resources
Improving Arabic HMM based speech synthesis quality
HMM-based speech synthesis, where speech parameters are generated directly from HMM models, is a new technique relative to other speech synthesis techniques. In this paper, we propose some modifications to the basic system to improve its quality. We apply a multi-band excitation model and use samples extracted from the spectral envelope as spectral parameters. In the synthesis, the voiced an...
Parameterization of vocal fry in HMM-based speech synthesis
HMM-based speech synthesis offers a way to generate speech with different voice qualities. However, sometimes databases contain certain inherent voice qualities that need to be parametrized properly. One example of this is vocal fry typically occurring at the end of utterances. A popular mixed excitation vocoder for HMM-based speech synthesis is STRAIGHT. The standard STRAIGHT is optimized for ...
Towards an improved modeling of the glottal source in statistical parametric speech synthesis
This paper proposes the use of the Liljencrants-Fant model (LF-model) to represent the glottal source signal in HMM-based speech synthesis systems. These systems generally use a pulse train to model the periodicity of the excitation signal of voiced speech. However, this model produces a strong and uniform harmonic structure throughout the spectrum of the excitation which makes the synthetic spe...
Sub-band text-to-speech combining sample-based spectrum with statistically generated spectrum
As described in this paper, we propose a sub-band speech synthesis approach to develop a high quality Text-to-Speech (TTS) system: a sample-based spectrum is used in the high-frequency band and spectrum generated by HMM-based TTS is used in the low-frequency band. Herein, sample-based spectrum means spectrum selected from a phoneme database such that it is the most similar to spectrum generated...
Analysis on the Importance of Short-Term Speech Parameterizations for Emotional Statistical Parametric Speech Synthesis
This paper presents a study on the importance of short-term spectral and excitation parameterizations for emotional hidden Markov model (HMM)-based speech synthesis. The analysis is performed through an emotion classification task by using two methods: K-means emotion clustering and Gaussian Mixture Models (GMMs)-based emotion identification. Two known forms of parameterization for the short-term...
Journal:
- IEICE Transactions
Volume 90-D
Publication date: 2007